<!DOCTYPE html>
<html>

<head>
  <title>Prompt-Singer</title>
  <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.1.3/dist/css/bootstrap.min.css" rel="stylesheet" />
  <meta charset="utf-8" />
  <meta name="viewport" content="width=device-width, initial-scale=1" />
  <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.6.0/jquery.min.js"></script>
  <script src="helper.js" defer></script>
  <style>
    td {
      vertical-align: middle;
      min-width: 220px;
    }

    audio {
      height: 50px;
      width: 20vw;
      min-width: 70px;
      max-width: 220px;
    }
  </style>
</head>

<body>
  <div class="container pt-5 mt-5 shadow p-5 mb-5 bg-white rounded">
    <div class="text-center">
      <h1>Prompt-Singer</h1>
      <h3>Controllable Singing-Voice-Synthesis with Natural Language Prompt</h3>
      <p class="lead fw-bold">
          
          |<a
          href="https://aclanthology.org/2024.naacl-long.268/"
          class="btn border-white bg-white fw-bold"
          >Paper</a>|<a
            href="https://arxiv.org/abs/2403.11780"
            class="btn border-white bg-white fw-bold"
            >arXiv</a>|<a
            href="https://github.com/cyanbx/Prompt-Singer"
            class="btn border-white bg-white fw-bold"
            >Code</a>|
        </p>
    </div>
    <p>
      <b>Abstract.</b>

      Recent singing-voice-synthesis (SVS) methods have achieved remarkable audio quality and naturalness, yet they lack the capability to explicitly control the style attributes of the synthesized singing. We propose Prompt-Singer, the first SVS method that enables control over singer gender, vocal range, and volume with natural language. We adopt a model architecture based on a decoder-only transformer with a multi-scale hierarchy, and design a range-melody decoupled pitch representation that enables text-conditioned vocal range control while maintaining melodic accuracy. Furthermore, we explore various experimental settings, including different types of text representations, text encoder fine-tuning, and the introduction of speech data to alleviate data scarcity, aiming to facilitate further research. Experiments show that our model achieves favorable control ability and audio quality. Audio samples are available at <a href="http://prompt-singer.github.io">http://prompt-singer.github.io</a>.
    </p>
  </div>

  <div class="container pt-5 mt-5 shadow p-5 mb-5 bg-white rounded">
    <h2 id="model-overview" style="text-align: center;">Model Overview</h2>
    <div>
      <p><br /></p>
      <p style="text-align: center;">
        <img src="new_arch.png" alt="Overall architecture of Prompt-Singer" height="200" width="1050" class="img-fluid">
      </p>
      <p><br /></p>
    </div>
    <p>
      The overall architecture of our model is illustrated in Figure (a). It is primarily composed of two sub-modules: 1) the multi-scale transformer, which generates discrete acoustic units conditioned on a natural language prompt, lyrics with durations, and pitch information; and 2) the unit vocoder, which maps the generated acoustic units to an audio waveform.
    </p>
    <p>
      The multi-scale transformer serves as the backbone of our model. It is a decoder-only transformer with a hierarchical structure that facilitates the modeling of long sequences. This module generates discrete acoustic units of singing voices conditioned on natural language prompts, lyric phonemes, phoneme durations, and a vocal-range-agnostic melody representation, with the vocal-range factor produced as an intermediate output. During training, the conditional inputs and target outputs are concatenated into a single sequence and fed to the transformer, which models their correlation through next-token prediction, with the cross-entropy loss computed only on the target output part. During inference, the model autoregressively predicts the range factor and the acoustic units conditioned on the prefix input sequence. Once acoustic unit generation finishes, the unit vocoder maps the generated units to a high-fidelity audio waveform.
    </p>
  </div>

  <div class="container pt-5 mt-5 shadow p-5 mb-5 bg-white rounded">
    <h2 id="tablecontents" style="text-align: left;">Table of Contents</h2>
    <ul>
        <li><a href="#gender" class="btn border-white bg-white fw-bold">Singer Gender Control</a></li>
        <li><a href="#pitch" class="btn border-white bg-white fw-bold">Vocal Range Control</a></li>
        <li><a href="#volume" class="btn border-white bg-white fw-bold">Volume Control</a></li>
        <li><a href="#multiple" class="btn border-white bg-white fw-bold">Multi-Attribute Control</a></li>
        <li><a href="#lowresource" class="btn border-white bg-white fw-bold">Low-Resource Results</a></li>
    </ul>
  </div>

  <div class="container shadow p-5 mb-5 bg-white rounded">
    <h3>Singer Gender Control<a id="gender"></a></h3>

    <p style="margin-top: 2em">
      In this section, we provide samples of prompted control over the singer gender. The results are from Prompt-Singer with a fine-tuned FLAN-T5 large text encoder.
    </p>
    <div class="container pt-3 table-responsive">
      <table id="gender_gt_1">
        <tr height=100px>
          <td colspan="3" style="text-align: center"><b>Lyrics:</b> 快乐时你不用分心想起我，难过时请一定记得联络我 &nbsp &nbsp <b>Reference Singing:</b> &nbsp <audio controls controlslist="nodownload" class="px-1"> <source src='data/gender/0_gt.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
      </tr>
      </table>
      <table class="table table-hover" id="gender_table_1">
        <thead>
          <tr>
            <th style="text-align: center">Label</th>
            <th style="text-align: center">Prompt</th>
            <th style="text-align: center">Generated Singing</th>
          </tr>
        </thead>
        <tbody>
          <tr height=100px>
            <td style="text-align: center">Male</td>
            <td style="text-align: center">Would you give me a song sung by a male vocalist?</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/gender/0_male.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">Female</td>
            <td style="text-align: center">I'm looking for a song with a woman singer.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/gender/0_female.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
        </tbody>
      </table>
      <p>&nbsp</p>

      <table id="gender_gt_2">
        <tr height=100px>
          <td colspan="3" style="text-align: center"><b>Lyrics:</b> 才不会让你替我受罪，婚礼上多喝几杯，和你现在那位 &nbsp &nbsp <b>Reference Singing:</b> &nbsp <audio controls controlslist="nodownload" class="px-1"> <source src='data/gender/1_gt.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
      </tr>
      </table>
      <table class="table table-hover" id="gender_table_2">
        <thead>
          <tr>
            <th style="text-align: center">Label</th>
            <th style="text-align: center">Prompt</th>
            <th style="text-align: center">Generated Singing</th>
          </tr>
        </thead>
        <tbody>
          <tr height=100px>
            <td style="text-align: center">Male</td>
            <td style="text-align: center">Do you know any songs with a boy singer?</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/gender/1_male.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">Female</td>
            <td style="text-align: center">I'm interested in a song with a lass vocalist, if possible.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/gender/1_female.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
        </tbody>
      </table>
      <p>&nbsp</p>

      <table id="gender_gt_3">
        <tr height=100px>
          <td colspan="3" style="text-align: center"><b>Lyrics:</b> 时光时光慢些吧 &nbsp &nbsp <b>Reference Singing:</b> &nbsp <audio controls controlslist="nodownload" class="px-1"> <source src='data/gender/2_gt.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
      </tr>
      </table>
      
      <table class="table table-hover" id="gender_table_3">
        <thead>
          <tr>
            <th style="text-align: center">Label</th>
            <th style="text-align: center">Prompt</th>
            <th style="text-align: center">Generated Singing</th>
          </tr>
        </thead>
        <tbody>
          <tr height=100px>
            <td style="text-align: center">Male</td>
            <td style="text-align: center">I'm searching for a song featuring a guy singer.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/gender/2_male.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">Female</td>
            <td style="text-align: center">Can you compose a song performed by a female singer?</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/gender/2_female.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
        </tbody>
      </table>
      <p>&nbsp</p>

      <table id="gender_gt_4">
        <tr height=100px>
          <td colspan="3" style="text-align: center"><b>Lyrics:</b> 风吹来的砂冥冥在哭泣，难道早就预言了分离 &nbsp &nbsp <b>Reference Singing:</b> &nbsp <audio controls controlslist="nodownload" class="px-1"> <source src='data/gender/3_gt.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
      </tr>
      </table>
      <table class="table table-hover" id="gender_table_4">
        <thead>
          <tr>
            <th style="text-align: center">Label</th>
            <th style="text-align: center">Prompt</th>
            <th style="text-align: center">Generated Singing</th>
          </tr>
        </thead>
        <tbody>
          <tr height=100px>
            <td style="text-align: center">Male</td>
            <td style="text-align: center">I want to listen to a song with a man voice.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/gender/3_male.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">Female</td>
            <td style="text-align: center">I'm in the mood for a song performed by a madam artist.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/gender/3_female.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
        </tbody>
      </table>
    </div>

  </div>

  <div class="container shadow p-5 mb-5 bg-white rounded">
    <h3>Vocal Range Control<a id="pitch"></a></h3>

    <p style="margin-top: 2em">
      In this section, we provide samples of prompted control over the vocal range. The results are from Prompt-Singer with a fine-tuned FLAN-T5 large text encoder.
    </p>
    <div class="container pt-3 table-responsive">
      <table id="pitch_gt_1">
        <tr height=100px>
          <td colspan="3" style="text-align: center"><b>Lyrics:</b> 快乐缺点勇气，浪漫缺点诗意，沉默一句一句都是谜题 &nbsp &nbsp <b>Reference Singing:</b> &nbsp <audio controls controlslist="nodownload" class="px-1"> <source src='data/pitch/1_gt.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
      </tr>
      </table>
      <table class="table table-hover" id="pitch_table_1">
        <thead>
          <tr>
            <th style="text-align: center">Label</th>
            <th style="text-align: center">Prompt</th>
            <th style="text-align: center">Generated Singing</th>
          </tr>
        </thead>
        <tbody>
          <tr height=100px>
            <td style="text-align: center">Low (male)</td>
            <td style="text-align: center">Can you generate a guy singer's song with a deep pitch?</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/pitch/1_low.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">High (male)</td>
            <td style="text-align: center">Compose a man artist's song with a captivating high pitch.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/pitch/1_high.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
        </tbody>
      </table>
      <p>&nbsp</p>

      <table id="pitch_gt_2">
        <tr height=100px>
          <td colspan="3" style="text-align: center"><b>Lyrics:</b> 中古世纪的城市里，我想就走到这 &nbsp &nbsp <b>Reference Singing:</b> &nbsp <audio controls controlslist="nodownload" class="px-1"> <source src='data/pitch/2_gt.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
      </tr>
      </table>
      <table class="table table-hover" id="pitch_table_2">
        <thead>
          <tr>
            <th style="text-align: center">Label</th>
            <th style="text-align: center">Prompt</th>
            <th style="text-align: center">Generated Singing</th>
          </tr>
        </thead>
        <tbody>
          <tr height=100px>
            <td style="text-align: center">Low (male)</td>
            <td style="text-align: center">Create a song with a bass pitch and man vocals.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/pitch/2_low.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">High (male)</td>
            <td style="text-align: center">Design a boy voice's song with sharp harmony.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/pitch/2_high.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
        </tbody>
      </table>
      <p>&nbsp</p>

      <table id="pitch_gt_3">
        <tr height=100px>
          <td colspan="3" style="text-align: center"><b>Lyrics:</b> 已经拥有你 &nbsp &nbsp <b>Reference Singing:</b> &nbsp <audio controls controlslist="nodownload" class="px-1"> <source src='data/pitch/3_gt.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
      </tr>
      </table>
      
      <table class="table table-hover" id="pitch_table_3">
        <thead>
          <tr>
            <th style="text-align: center">Label</th>
            <th style="text-align: center">Prompt</th>
            <th style="text-align: center">Generated Singing</th>
          </tr>
        </thead>
        <tbody>
          <tr height=100px>
            <td style="text-align: center">Low (female)</td>
            <td style="text-align: center">Compose a deep pitch song with a female lead singer.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/pitch/3_low.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">High (female)</td>
            <td style="text-align: center">Can you create a song with a girl voice and shrieking note?</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/pitch/3_high.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
        </tbody>
      </table>
      <p>&nbsp</p>

      <table id="pitch_gt_4">
        <tr height=100px>
          <td colspan="3" style="text-align: center"><b>Lyrics:</b> 才是考验，没意见你想怎样我都随便 &nbsp &nbsp <b>Reference Singing:</b> &nbsp <audio controls controlslist="nodownload" class="px-1"> <source src='data/pitch/4_gt.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
      </tr>
      </table>
      <table class="table table-hover" id="pitch_table_4">
        <thead>
          <tr>
            <th style="text-align: center">Label</th>
            <th style="text-align: center">Prompt</th>
            <th style="text-align: center">Generated Singing</th>
          </tr>
        </thead>
        <tbody>
          <tr height=100px>
            <td style="text-align: center">Low (female)</td>
            <td style="text-align: center">Can you generate a miss singer's song with a low pitch?</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/pitch/4_low.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">High (female)</td>
            <td style="text-align: center">Design high-pitched harmonies with a woman vocalist.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/pitch/4_high.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
        </tbody>
      </table>
    </div>

  </div>

  <div class="container shadow p-5 mb-5 bg-white rounded">
    <h3>Volume Control<a id="volume"></a></h3>

    <p style="margin-top: 2em">
      In this section, we provide samples of prompted control over the volume. The results are from Prompt-Singer with a fine-tuned FLAN-T5 large text encoder.
    </p>
    <div class="container pt-3 table-responsive">
      <table id="volume_gt_1">
        <tr height=100px>
          <td colspan="3" style="text-align: center"><b>Lyrics:</b> 从背后抱你的时候 &nbsp &nbsp <b>Reference Singing:</b> &nbsp <audio controls controlslist="nodownload" class="px-1"> <source src='data/volume/1_gt.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
      </tr>
      </table>
      <table class="table table-hover" id="volume_table_1">
        <thead>
          <tr>
            <th style="text-align: center">Label</th>
            <th style="text-align: center">Prompt</th>
            <th style="text-align: center">Generated Singing</th>
          </tr>
        </thead>
        <tbody>
          <tr height=100px>
            <td style="text-align: center">Low</td>
            <td style="text-align: center">Play me a song with a whispering voice.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/volume/1_low.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">Medium</td>
            <td style="text-align: center">Please give me a song with a voice that strikes a harmonious balance between gentleness and power.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/volume/1_mid.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">High</td>
            <td style="text-align: center">Give me a song with a deafening voice.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/volume/1_high.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
        </tbody>
      </table>
      <p>&nbsp</p>

      <table id="volume_gt_2">
        <tr height=100px>
          <td colspan="3" style="text-align: center"><b>Lyrics:</b> 欲望请放过脆弱的我 &nbsp &nbsp <b>Reference Singing:</b> &nbsp <audio controls controlslist="nodownload" class="px-1"> <source src='data/volume/2_gt.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
      </tr>
      </table>
      <table class="table table-hover" id="volume_table_2">
        <thead>
          <tr>
            <th style="text-align: center">Label</th>
            <th style="text-align: center">Prompt</th>
            <th style="text-align: center">Generated Singing</th>
          </tr>
        </thead>
        <tbody>
          <tr height=100px>
            <td style="text-align: center">Low</td>
            <td style="text-align: center">I need a song with a twittering voice.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/volume/2_low.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">Medium</td>
            <td style="text-align: center">I'd like to listen to a song with a middle-range voice.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/volume/2_mid.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">High</td>
            <td style="text-align: center">Give me a song with a roaring voice.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/volume/2_high.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
        </tbody>
      </table>
      <p>&nbsp</p>

      <table id="volume_gt_3">
        <tr height=100px>
          <td colspan="3" style="text-align: center"><b>Lyrics:</b> 我不羡慕太阳，照不亮你过往 &nbsp &nbsp <b>Reference Singing:</b> &nbsp <audio controls controlslist="nodownload" class="px-1"> <source src='data/volume/3_gt.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
      </tr>
      </table>
      
      <table class="table table-hover" id="volume_table_3">
        <thead>
          <tr>
            <th style="text-align: center">Label</th>
            <th style="text-align: center">Prompt</th>
            <th style="text-align: center">Generated Singing</th>
          </tr>
        </thead>
        <tbody>
          <tr height=100px>
            <td style="text-align: center">Low</td>
            <td style="text-align: center">Design a song with a quiet voice, gently whispering lyrics to my soul.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/volume/3_low.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">Medium</td>
            <td style="text-align: center">I'm interested in a song with a moderate voice.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/volume/3_mid.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">High</td>
            <td style="text-align: center">Synthesize a song with a booming voice for me.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/volume/3_high.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
        </tbody>
      </table>

    </div>

  </div>

  <div class="container shadow p-5 mb-5 bg-white rounded">
    <h3>Multi-Attribute Control<a id="multiple"></a></h3>

    <p style="margin-top: 2em">
      In this section, we provide samples of prompted control over multiple attributes. The results are from Prompt-Singer with a fine-tuned FLAN-T5 large text encoder.
    </p>
    <div class="container pt-3 table-responsive">
      <table id="multiple_gt_0">
        <tr height=100px>
          <td colspan="3" style="text-align: center"><b>Lyrics:</b> 在世上，命运不能更改 &nbsp &nbsp <b>Reference Singing:</b> &nbsp <audio controls controlslist="nodownload" class="px-1"> <source src='data/multiple/0_gt.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
      </tr>
      </table>
      <table class="table table-hover" id="multiple_table_0">
        <thead>
          <tr>
            <th style="text-align: center">Labels</th>
            <th style="text-align: center">Prompt</th>
            <th style="text-align: center">Generated Singing</th>
          </tr>
        </thead>
        <tbody>
          <tr height=100px>
            <td style="text-align: center">Female, Low Volume, Low Pitch</td>
            <td style="text-align: center">Generate a female singer with a whispering voice to compose a song in a low pitch.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/multiple/0_female_low_low.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">Female, Low Volume, High Pitch</td>
            <td style="text-align: center">Can you produce a melody featuring a treble pitch and miss voice with a slight sound?</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/multiple/0_female_low_high.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">Female, Medium Volume, Low Pitch</td>
            <td style="text-align: center">Compose a song with a low-pitched pitch and woman artist featuring a moderate vocal style.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/multiple/0_female_mid_low.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">Female, Medium Volume, High Pitch</td>
            <td style="text-align: center">Can you make a song in a sharp key? Need a miss singer with a moderate voice.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/multiple/0_female_mid_high.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">Female, High Volume, Low Pitch</td>
            <td style="text-align: center">Generate a lady singer's song with a loud voice that thunders, with a bass pitch.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/multiple/0_female_high_low.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">Female, High Volume, High Pitch</td>
            <td style="text-align: center">Create a high-pitched centered song with booming voice and girl singer.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/multiple/0_female_high_high.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">Male, Low Volume, Low Pitch</td>
            <td style="text-align: center">Synthesize a boy singer's song with a whispering sound at thick level.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/multiple/0_male_low_low.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">Male, Low Volume, High Pitch</td>
            <td style="text-align: center">Can you create a song with a twittering voice, and if possible, a man voice, that has a distinctive shrill sound?</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/multiple/0_male_low_high.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">Male, Medium Volume, Low Pitch</td>
            <td style="text-align: center">Create a gentleman singer with moderate vocals and beautiful bass harmonies.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/multiple/0_male_mid_low.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">Male, Medium Volume, High Pitch</td>
            <td style="text-align: center">Generate a sir singer's song with a intermediate voice and shrieking harmony.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/multiple/0_male_mid_high.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">Male, High Volume, Low Pitch</td>
            <td style="text-align: center">Make a thick pitch song by a man singer with a ringing voice.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/multiple/0_male_high_low.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">Male, High Volume, High Pitch</td>
            <td style="text-align: center">Synthesize a song with a unique shrieking tone and a thunderous voice, preferably with a male singer.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/multiple/0_male_high_high.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
        </tbody>
      </table>

  </div>
  </div>

  <div class="container pt-5 mt-5 shadow p-5 mb-5 bg-white rounded">
    <h3 id="tablecontents" style="text-align: left;">Low-Resource Results<a id="lowresource"></a></h3>
    <p style="margin-top: 2em">
      We first provide samples in which <b>the model trained with additional speech data</b> shows <b>stronger attribute control</b>, while <b>the model trained solely on singing data</b> either <b>fails to control the attributes</b> or produces characteristics that are <b>not clearly pronounced</b>.
    </p>
    <div class="container pt-3 table-responsive">
      <table class="table table-hover" id="lowresource_table1">
        <thead>
          <tr>
            <th style="text-align: center">Labels</th>
            <th style="text-align: center">Prompt</th>
            <th style="text-align: center">Speech + Singing</th>
            <th style="text-align: center">Singing Only</th>
          </tr>
        </thead>
        <tbody>
          <tr height=100px>
            <td style="text-align: center">Male</td>
            <td style="text-align: center">Do you have any songs with a male singer?</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource1/male_speech.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource1/male_sing.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">Female</td>
            <td style="text-align: center">I'm interested in a song with a woman vocalist.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource1/female_speech.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource1/female_sing.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">Male, Low Pitch</td>
            <td style="text-align: center">Design a bass song performed by a guy singer.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource1/male_low_speech.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource1/male_low_sing.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">Male, High Pitch</td>
            <td style="text-align: center">Can you generate a guy singer's song with a high pitch?</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource1/male_high_speech.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource1/male_high_sing.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">Female, Low Pitch</td>
            <td style="text-align: center">Create a woman vocalist with bass pitch for an emotional song.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource1/female_low_speech.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource1/female_low_sing.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center">Female, High Pitch</td>
            <td style="text-align: center">Synthesize a song with female vocalist and a sharp pitch.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource1/female_high_speech.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource1/female_high_sing.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
        </tbody>
      </table>
    </div>
    
    <p style="margin-top: 2em">
      Next, we present samples from models trained on different amounts of singing data (10 minutes, 1 hour, 10 hours, and 100 hours), each combined with 100 hours of speech data. Listen for differences in the quality and melodic accuracy of the synthesized singing. <b>(You may need to scroll right to see the full results.)</b>
    </p>

    <div class="container pt-3 table-responsive">
      <table class="table table-hover" id="lowresource_table2">
        <thead>
          <tr>
            <th style="text-align: center">Ref Singing</th>
            <th style="text-align: center">Labels</th>
            <th style="text-align: center">Prompt</th>
            <th style="text-align: center">Singing 10min + Speech 100h</th>
            <th style="text-align: center">Singing 1h + Speech 100h</th>
            <th style="text-align: center">Singing 10h + Speech 100h</th>
            <th style="text-align: center">Singing 100h + Speech 100h</th>
          </tr>
        </thead>
        <tbody>
          <tr height=100px>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/female_gt.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center">Female</td>
            <td style="text-align: center">I need a song with a female lead singer. </td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/female_10min.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/female_1h.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/female_10h.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/female_100h.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/male_gt.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center">Male</td>
            <td style="text-align: center">I want to listen to a song with a guy voice, if possible. </td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/male_10min.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/male_1h.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/male_10h.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/male_100h.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/female_high_gt.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center">Female, High Pitch</td>
            <td style="text-align: center">Compose a song with a female voice and its unique charm in its treble pitch. </td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/female_high_10min.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/female_high_1h.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/female_high_10h.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/female_high_100h.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/female_low_gt.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center">Female, Low Pitch</td>
            <td style="text-align: center">Creating a song with a lass vocalist and a distinct use of thick pitch.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/female_low_10min.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/female_low_1h.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/female_low_10h.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/female_low_100h.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/male_high_gt.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center">Male, High Pitch</td>
            <td style="text-align: center">Can you create a song featuring a man vocalist and emphasizing the shrill note? </td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/male_high_10min.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/male_high_1h.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/male_high_10h.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/male_high_100h.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/male_low_gt.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center">Male, Low Pitch</td>
            <td style="text-align: center">Composing a gentleman singer's song with a deep pitch.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/male_low_10min.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/male_low_1h.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/male_low_10h.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/male_low_100h.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/volume_low_gt.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center">Low Volume</td>
            <td style="text-align: center">Play me a song with a hushed voice.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/volume_low_10min.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/volume_low_1h.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/volume_low_10h.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/volume_low_100h.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/volume_mid_gt.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center">Medium Volume</td>
            <td style="text-align: center">I'd like to listen to a song with a moderate voice.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/volume_mid_10min.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/volume_mid_1h.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/volume_mid_10h.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/volume_mid_100h.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
          <tr height=100px>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/volume_high_gt.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center">High Volume</td>
            <td style="text-align: center">Give me a song with a roaring voice.</td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/volume_high_10min.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/volume_high_1h.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/volume_high_10h.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
            <td style="text-align: center"><audio controls controlslist="nodownload" class="px-1"> <source src='data/lowresource2/volume_high_100h.wav' type="audio/wav">Your browser does not support the audio element.</audio></td>
          </tr>
        </tbody>
      </table>
    </div>
    

  </div>

</body>

</html>